Abstract

An example is given of a qubit quantum channel which requires four inputs to maximize 
the Holevo capacity. The example is one of a family of channels which are related to 3-state 
channels. The capacity of the product channel is studied and numerical evidence presented 
which strongly suggests additivity. The numerical evidence also supports a conjecture about 
the concavity of output entropy as a function of entanglement parameters. However, an example 
is presented which shows that for some channels this conjecture does not hold for all input states. 
A numerical algorithm for finding the capacity and optimal inputs is presented and its relation 
to a relative entropy optimization discussed. 



e-mail addresses: masahito@qci.jst.go.jp, imai@is.s.u-tokyo.ac.jp, keiji@nii.ac.jp,
MaryBeth.Ruskai@tufts.edu, shimono@is.s.u-tokyo.ac.jp

*ERATO Quantum Computation and Information Project, JST, Daini Hongo White Bldg. 201, 5-28-3, Hongo, 
Bunkyo-ku, Tokyo 113-0033, Japan 

†Department of Computer Science, University of Tokyo, 7-3-1, Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan
‡National Institute of Informatics (NII), 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan
§Department of Mathematics, Tufts University, Medford, Massachusetts 02155 USA

¶Partially supported by the National Security Agency (NSA) and Advanced Research and Development Activity
(ARDA) under Army Research Office (ARO) contract number DAAD19-02-1-0065, and by the National Science 
Foundation under Grant DMS-0314228. 






1 Introduction 



The Holevo capacity $C(\Gamma)$ of a 1-qubit quantum channel $\Gamma$ is defined as the supremum over all
possible ensembles of 1-qubit density matrices $\rho_i$ and probability distributions $\pi_i$ of

$$S(\Gamma(\rho)) - \sum_i \pi_i S(\Gamma(\rho_i))$$

where $\rho = \sum_i \pi_i \rho_i$ is the average input and $S(\sigma) = -\mathrm{Tr}(\sigma \log \sigma)$ denotes the von Neumann entropy.
The Holevo capacity gives the maximum rate at which classical information can be transmitted
through the quantum channel [?] using product inputs, but permitting entangled collective
measurements. It is a consequence of Carathéodory's Theorem and the convex structure of this
problem (as discussed in the next section) that the above supremum can be replaced with the
maximum over four input pairs $(\pi_i, \rho_i)$. (Davies seems to have been the first to recognize
the relevance of Carathéodory's Theorem to problems of this type in quantum information theory;
explicit application to quantum capacity optimization appeared in [?].) It was demonstrated in [?]
that there exist qubit channels requiring three input states to attain the maximum. However, it
was left open whether or not there are 1-qubit channels requiring four input states to achieve the
maximum. This paper shows that such 4-input channels do exist by presenting an example. The
computation of this capacity is a nonlinear programming problem, and it is much harder than the
classical channel capacity computation: the classical case is the maximization of a concave
function, while the quantum case is the maximization of a function which is concave with respect
to the probability variables, as in the classical case, but convex with respect to the state
variables. As for algorithms which compute the capacity by utilizing the special structure of the
problem, [?] developed an alternating-type algorithm by extending the well-known Arimoto-Blahut
algorithm for the classical channel capacity; it is implemented in [?] to check additivity. Use
of interior-point methods is suggested in [?]. A method is presented in [26] for computing the
capacity by combining linear programming techniques, including column generation, with non-linear
optimization. In this paper, we present an approximation algorithm to compute the capacity of a
1-qubit channel; our algorithm plays a key role in finding a 4-state channel numerically.

Although $C(\Gamma)$ plays an important role in quantum information theory, it is not known whether
or not using entangled inputs might increase the capacity. This is closely related to the question of
the additivity of $C(\Gamma \otimes \Gamma)$, which is now known [18, ?, 27] to be equivalent to other conjectures
including additivity of entanglement of formation. In addition to being of interest in their own right,
4-state channels are good candidates for testing the additivity conjecture of the Holevo capacity for
qubit channels. We present numerical evidence for additivity which, in view of special properties of
the channels, gives extremely strong evidence for additivity of both capacity and minimal output
entropy for qubit channels. Both results would follow from a new conjecture (which appeared
independently in [?]) about concavity of entropy as a function of entanglement parameters. Using
a different channel, we show that this conjecture is false, at least in full generality.

The paper is organized as follows. Basic background, definitions and notation for convex analysis
and qubit channels are presented in Sections 2 and 3, respectively. Numerical results for the 4-state
channel and the algorithm used to obtain them are described in Sections 4 and 5. Some intuition
about the properties of 3-state and 4-state channels is presented in Section 6 and shown to lead to
additional examples of 4-state channels. In Section 7, different views of the capacity optimization
are discussed and shown to be related to a relative entropy optimization. The additivity analysis
and counterexample to the concavity conjecture are given in Section 8. Throughout this paper, the
base of the logarithm is 2.



2 Convex Analysis 

The function to be maximized in the Holevo capacity has a special form, to which general convex
analysis may be applied. Based on [20], this section discusses the problem in this form.

Suppose $D$ is a $d$-dimensional bounded, closed convex set in $\mathbf{R}^d$, and $f$ is a closed, concave
function from $D$ to $\mathbf{R}$. We are interested in the following infinite programming problem.

$$F = \sup \Big( f(x) - \sum_i p_i f(x_i) \Big) \tag{1}$$

where $x = \sum_i p_i x_i$, $\sum_i p_i = 1$, and $p_i \ge 0$. This infinite mathematical programming problem can
be reduced to a finite mathematical programming problem with $d+1$ pairs $(x_i, p_i)$ as follows.

For a closed, concave function $g$ over $D$, its closure of convex hull function $\mathrm{cl\,conv}\, g$
is the greatest convex function majorized by $g$ (p. 36, p. 52 in [20]). In our case, further using
Carathéodory's Theorem (Theorem 17.1 in [20]), it is expressed as

$$\mathrm{cl\,conv}\, g(x) = \min\Big\{ \sum_{i=1}^{d+1} p_i g(x_i) \;:\; x = \sum_{i=1}^{d+1} p_i x_i,\ \sum_{i=1}^{d+1} p_i = 1,\ x_i \in D,\ p_i \ge 0\ (i = 1, \ldots, d+1) \Big\}.$$

It is then seen that the problem is reduced to the following Fenchel-type problem (cf. Fenchel's
duality theorem, Section 31, [20]):

$$\max\,\big( f(x) - \mathrm{cl\,conv}\, f(x) \big). \tag{2}$$

By virtue of nice properties of minimizing convex functions (e.g., Theorem 27.4 in [20]), the
optimality of a solution to this problem is well-known:

Lemma 1 $x^*$ is optimum in (2) if and only if there is $\xi \in \mathbf{R}^d$ such that, for any $x \in D$,

$$\xi \cdot (x - x^*) + \mathrm{cl\,conv}\, f(x^*) \le \mathrm{cl\,conv}\, f(x) \le f(x) \le \xi \cdot (x - x^*) + f(x^*).$$

Furthermore, when $f$ is strictly concave, there is a unique optimum solution.

The above discussion can be summarized in the form of the problem as follows:

Corollary 1 In the infinite mathematical programming problem (1), the supremum can be replaced
with the maximum over $d+1$ pairs $(x_i, p_i)$. If there exist $d+1$ affinely independent points $x_i$
($i = 1, \ldots, d+1$) such that a unique hyperplane passing through $(x_i, f(x_i))$ ($x_i \in D$, $i = 1, \ldots, d+1$)
in $\mathbf{R}^{d+1}$ is a supporting hyperplane to the convex set $\{ (x,y) \mid x \in D,\ \mathrm{cl\,conv}\, f(x) \le y \le f(x) \}$
from below, and, for these $x_i$ ($i = 1, \ldots, d+1$),

$$\max\Big\{ f\Big(\sum_{i=1}^{d+1} p_i x_i\Big) - \sum_{i=1}^{d+1} p_i f(x_i) \;\Big|\; \sum_{i=1}^{d+1} p_i = 1,\ p_i \ge 0 \Big\}$$

is attained with $p_i > 0$ for all $i = 1, \ldots, d+1$, then a set of $d+1$ pairs $(x_i, p_i)$ is an optimum
solution to (1).
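
As a toy illustration of the reduction to $d+1$ points, the following sketch treats the one-dimensional case $d = 1$ with $D = [0,1]$. The function `f` (a square root) is an assumption chosen only because it is closed and strictly concave; it is unrelated to the entropy function studied later. For a concave function on an interval, the greatest convex function majorized by it is the chord through the two endpoint values, so two points suffice.

```python
import math

def f(x):
    # Illustrative closed, strictly concave function on D = [0, 1] (an assumption).
    return math.sqrt(x)

def cl_conv_f(x):
    # For concave f on an interval, cl conv f is the chord through the endpoints,
    # i.e. a convex combination of d + 1 = 2 points.
    return (1.0 - x) * f(0.0) + x * f(1.0)

def solve_on_grid(n=1000):
    # Brute-force search for max_x f(x) - cl conv f(x) on a uniform grid.
    best_x, best_gap = 0.0, 0.0
    for j in range(n + 1):
        x = j / n
        gap = f(x) - cl_conv_f(x)
        if gap > best_gap:
            best_x, best_gap = x, gap
    return best_x, best_gap

best_x, best_gap = solve_on_grid()
print(best_x, best_gap)
```

Here the maximizer is the average $x = \tfrac14 \cdot 1 + \tfrac34 \cdot 0$ of the two endpoints, both with strictly positive weight, which mirrors the condition $p_i > 0$ in Corollary 1.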
3 Set-up



In the calculation of channel capacity for states on $\mathbf{C}^d$, the convex set $D$ is the set of density
matrices, i.e., the set of $d \times d$ positive semi-definite matrices with trace 1. This is isomorphic to a
convex subset of $\mathbf{R}^{d^2-1}$. A channel $\Gamma(\rho)$ is described by a special type of linear map on the set of
density matrices, namely, one which is also completely positive and trace-preserving.

In the case of qubits, it is well-known that the set $D$ of density matrices is isomorphic to the
unit ball in $\mathbf{R}^3$ via the Bloch sphere representation. We will use the notation $\rho(x, y, z)$ to denote
the density matrix $\tfrac12 [I + x\sigma_x + y\sigma_y + z\sigma_z]$. It was shown in [?] that, up to specification of bases, a
qubit channel can be written in the form

$$\Gamma[\rho(x, y, z)] = \rho(\lambda_1 x + t_1,\ \lambda_2 y + t_2,\ \lambda_3 z + t_3) \tag{3}$$

which gives an affine transformation on the Bloch sphere. In fact, it maps the Bloch sphere
$\{ (x,y,z) \mid x^2 + y^2 + z^2 \le 1 \}$ to an ellipsoid with axes of lengths $\lambda_1, \lambda_2, \lambda_3$ and center $(t_1, t_2, t_3)$.
Complete positivity poses additional constraints on the parameters $\{\lambda_k, t_k\}$ which are given in [22].
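
The affine action (3) and the resulting output entropy are easy to evaluate directly on Bloch vectors. The sketch below is illustrative only: the helper names `apply_channel` and `qubit_entropy` are not from the paper, and the parameter values are those of the 4-state channel reported in Table 1. It uses the fact that $\rho(x,y,z)$ has eigenvalues $(1 \pm r)/2$ with $r = \sqrt{x^2+y^2+z^2}$.

```python
import math

def apply_channel(bloch, lam, shift):
    # Affine action of Eq. (3) on a Bloch vector (x, y, z).
    return tuple(l * v + s for v, l, s in zip(bloch, lam, shift))

def qubit_entropy(bloch):
    # von Neumann entropy (base 2) of rho(x, y, z); eigenvalues are (1 +/- r)/2.
    r = math.sqrt(sum(v * v for v in bloch))
    total = 0.0
    for ev in ((1.0 + r) / 2.0, (1.0 - r) / 2.0):
        if ev > 0.0:
            total -= ev * math.log2(ev)
    return total

LAM = (0.6, 0.601, 0.5)       # lambda_1, lambda_2, lambda_3 of the 4-state channel
SHIFT = (0.021, 0.0, 0.495)   # t_1, t_2, t_3 of the 4-state channel

# First optimal input listed in Table 1.
out = apply_channel((0.2530759862, 0.0, 0.9674464043), LAM, SHIFT)
s_out = qubit_entropy(out)
print(out, s_out)
```

The printed output vector and entropy can be compared with the first row of the output half of Table 1.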

The strict concavity of $S(\rho)$ implies that $S[\Gamma(\rho)]$ is also strictly concave for channels which are
one-to-one. In the case of qubits, this will hold unless the channel maps the Bloch sphere into a
one- or two-dimensional subset, which can only happen when one of the parameters $\lambda_k = 0$.

4 Numerical results 

The theory in Section 2 can be used to calculate the capacity with $f(\rho) = S[\Gamma(\rho)]$. We are interested
in qubit channels with all $\lambda_k \ne 0$ so that strict concavity holds. Then the optimization problem as
formulated in (2) has a unique solution. However, in the form as restricted in Corollary 1, it
may have multiple optimum solutions when the hyperplane passes through more than $d + 1$ such
points.

Numerical optimization to compute the capacity of this channel was initially performed by
utilizing the mathematical programming package NUOPT [?] of Mathematical Systems Inc. These
results, accurate to at most 7-8 significant figures, were further refined by using them as starting
points in a program to find a critical point of the capacity by applying Newton's method to the
gradient. The results are shown in Table 1.

To verify that these results give a true 4-state optimum, the function $S(\Gamma(\rho(x, y, z))) - \xi \cdot \Gamma(\rho)$
was computed and plotted with $\xi = (-0.0396622022, 0, -0.9621071440)$. These results are shown
in Figure 1 and confirm the condition that the hyperplane $(\xi, -1) \cdot (x,y,z,w) = -0.9785055621$
passes through the four points $(x_i, y_i, z_i, S(\Gamma[\rho(x_i, y_i, z_i)]))$ and the condition that the hyperplane
lies below the surface $(x, y, z, S(\Gamma[\rho(x,y,z)]))$ in $\mathbf{R}^4$. (The components $\xi_x, \xi_y, \xi_z$ of $\xi$ are obtained
by solving the four simultaneous equations $\xi^T \cdot \Gamma(\rho_k) + \xi_0 = S(\Gamma[\rho(x_k, y_k, z_k)])$ ($k = 1,2,3,4$) for
the variables $(\xi_x, \xi_y, \xi_z, \xi_0)$.) As discussed in Section 7, this is equivalent to a relative entropy
optimization.
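
The four simultaneous equations for $(\xi_x, \xi_y, \xi_z, \xi_0)$ form a small linear system that can be checked against the tabulated data. The following is a minimal sketch, assuming the optimal outputs and output entropies reported in Table 1; the Gauss-Jordan helper `solve_linear` is a generic routine written for this illustration and is not the procedure used by the authors.

```python
def solve_linear(a, b):
    # Gauss-Jordan elimination with partial pivoting for a small dense system.
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

# Optimal outputs Gamma(rho_k) and their entropies, copied from Table 1.
OUTPUTS = [
    (0.1728455917, 0.0, 0.9787232022),
    (0.6080370599, 0.0, 0.5983719359),
    (-0.2630452520, 0.5196523295, 0.4109297812),
    (-0.2630452520, -0.5196523295, 0.4109297812),
]
ENTROPIES = [0.0300135405, 0.3786915585, 0.5935800377, 0.5935800377]

# Each row encodes xi . Gamma(rho_k) + xi_0 = S(Gamma(rho_k)).
rows = [[x, y, z, 1.0] for (x, y, z) in OUTPUTS]
xi_x, xi_y, xi_z, xi_0 = solve_linear(rows, ENTROPIES)
print(xi_x, xi_y, xi_z, xi_0)
```

The solution can be compared with the quoted $\xi$ and with the constant 0.9785055621 appearing in the hyperplane equation.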

In addition, the optimal three-state capacity was also computed and shown to be < 0.321461,
which is strictly less than the 4-state capacity of 0.321485. Details for the 3-state capacity can be
found in Table 2 (Section 6). As an optimization problem, the capacity has other local maxima in
addition to the 3-state and 4-state results discussed above. For example, there are several 2-state
optima, but these have lower capacity and are not relevant to the work presented here.
Figure 1: Depiction of $F(x,y,z) = S(\Gamma[\rho(x,y,z)]) - \xi^T \Gamma[\rho(x,y,z)]$ and relative entropy
$H[\Gamma(\omega), \Gamma(\rho_{Av})] = 1.299989 - F(x,y,z)$ with respect to the optimal average output in terms of color
(or grey scale) on the boundary of the Bloch sphere and its image. One scale is given for interpretation
as $F$ and one for interpretation as $H$.



$\Gamma[\rho(x, y, z)] = \rho(0.6x + 0.021,\ 0.601y,\ 0.5z + 0.495)$        capacity = 0.3214851589

$S(\Gamma(\rho_i)) - \xi^T \Gamma(\rho_i) = 0.9785055621\ \ \forall i$        $H[\Gamma(\rho_i), \Gamma(\rho_{Av})] = 0.3214851589\ \ \forall i$

probability     optimal input (x, y, z)                           angular coordinates
0.2322825705    ( 0.2530759862, -0.0000000000,  0.9674464043)     0.127929
0.2133220819    ( 0.9783950999,  0.0000000000,  0.2067438718)     0.681275    0.0
0.2771976738    (-0.4734087533,  0.8646461389, -0.1681404376)     0.869870    2.071131
0.2771976738    (-0.4734087533, -0.8646461389, -0.1681404376)     0.869870   -2.071131
average         ( 0.0050428099,  0.0000000000,  0.1756076944)

$\phi, \theta$ denote the angular coordinates of the optimal inputs.

probability     optimal output (x, y, z)                          $S[\Gamma(\rho)]$
0.2322825705    ( 0.1728455917,  0.0000000000,  0.9787232022)     0.0300135405
0.2133220819    ( 0.6080370599,  0.0000000000,  0.5983719359)     0.3786915585
0.2771976738    (-0.2630452520,  0.5196523295,  0.4109297812)     0.5935800377
0.2771976738    (-0.2630452520, -0.5196523295,  0.4109297812)     0.5935800377
average         ( 0.0240256859,  0.0000000000,  0.5828038472)     0.7383180644

Table 1: Data for 4-state channel
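
The capacity quoted in Table 1 can be reproduced from the listed ensemble by evaluating $S(\Gamma(\rho)) - \sum_i \pi_i S(\Gamma(\rho_i))$ directly. The sketch below is a consistency check under the assumption that the tabulated probabilities and inputs are exact to the printed digits; the function names are illustrative.

```python
import math

LAM = (0.6, 0.601, 0.5)
SHIFT = (0.021, 0.0, 0.495)

# (probability, optimal input Bloch vector) pairs copied from Table 1.
ENSEMBLE = [
    (0.2322825705, (0.2530759862, 0.0, 0.9674464043)),
    (0.2133220819, (0.9783950999, 0.0, 0.2067438718)),
    (0.2771976738, (-0.4734087533, 0.8646461389, -0.1681404376)),
    (0.2771976738, (-0.4734087533, -0.8646461389, -0.1681404376)),
]

def channel(v):
    # Affine map of Eq. (3) for the 4-state channel.
    return tuple(l * c + s for l, c, s in zip(LAM, v, SHIFT))

def entropy(v):
    # Base-2 von Neumann entropy of the qubit state with Bloch vector v.
    r = math.sqrt(sum(c * c for c in v))
    return -sum(e * math.log2(e) for e in ((1 + r) / 2, (1 - r) / 2) if e > 0)

def holevo_quantity(ensemble):
    # S(Gamma(average input)) minus the average output entropy.
    avg_in = tuple(sum(p * v[c] for p, v in ensemble) for c in range(3))
    mean_entropy = sum(p * entropy(channel(v)) for p, v in ensemble)
    return entropy(channel(avg_in)) - mean_entropy

chi = holevo_quantity(ENSEMBLE)
print(chi)
```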

5 Approximation Algorithm to Compute the Holevo Capacity 

To find the 4-state channel given above, the following approximation algorithm was repeatedly
applied with various parameters. In practice, this approximation algorithm is almost sufficient to
compute the Holevo capacity of a 1-qubit channel.

Recall that the problem (1) is an infinite mathematical programming problem. If all
$x_i \in D$ are considered, this infinite set may be regarded as fixed, leaving only the $p_i$ as variables. The
objective function is concave with respect to the $p_i$, which makes the problem well suited to numerical
solution, although it is still an infinite one.

For a 1-qubit channel, owing to the concavity of the von Neumann entropy, in the formulation (1),
$x_i$ can be restricted to a pure state, i.e., $x^2 + y^2 + z^2 = 1$ in terms of the Bloch
sphere. The sphere is two-dimensional, and the convex hull of a square mesh of $k(k + 1)$ points
$(\sin\theta_j \cos\phi_l,\ \sin\theta_j \sin\phi_l,\ \cos\theta_j)$ with $\theta_j = j\pi/k$, $\phi_l = 2l\pi/k$ ($j = 0, \ldots, k$; $l = 0, \ldots, k - 1$)
is quite a good polyhedral approximation. For $j = 0, k$ and any $l$, the points become $(0,0,1)$ and
$(0,0,-1)$, so the total number of distinct points is $k^2 - k + 2$ (see Fig. 2, left). Then, considering the
problem of type (1) for these $k^2 - k + 2$ points with constraints $\sum_{i=1}^{k^2-k+2} p_i = 1$, $p_i \ge 0$, the maximum
of this $(k^2 - k + 2)$-dimensional concave maximization problem gives a close lower bound to the
real maximum of the original problem.

Interior-point methods can be applied to this high-dimensional concave maximization
problem (e.g., [19]). Computational results from NUOPT are shown in Fig. 2, right, from
which it can be seen that this approximation approach provides values sufficiently close to the Holevo
capacity in practice.
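
The mesh construction and the resulting finite concave problem can be sketched as follows. This is not the NUOPT interior-point computation described above: as a stand-in, the sketch uses a Blahut-Arimoto-style multiplicative update $p_i \leftarrow p_i\, 2^{H_i}/Z$, where $H_i$ is the relative entropy of the $i$-th output with respect to the current average output. The channel parameters are those of the 4-state channel; all function names and the iteration count are assumptions made for illustration.

```python
import math

LAM = (0.6, 0.601, 0.5)
SHIFT = (0.021, 0.0, 0.495)

def mesh(k):
    # Distinct mesh points on the unit sphere; the poles are counted once each.
    pts = [(0.0, 0.0, 1.0)]
    for j in range(1, k):
        theta = j * math.pi / k
        for l in range(k):
            phi = 2.0 * l * math.pi / k
            pts.append((math.sin(theta) * math.cos(phi),
                        math.sin(theta) * math.sin(phi),
                        math.cos(theta)))
    pts.append((0.0, 0.0, -1.0))
    return pts

def channel(v):
    return tuple(l * c + s for l, c, s in zip(LAM, v, SHIFT))

def entropy(v):
    r = math.sqrt(sum(c * c for c in v))
    return -sum(e * math.log2(e) for e in ((1 + r) / 2, (1 - r) / 2) if e > 0)

def rel_entropy(w, s_w, avg):
    # H[w, avg] in bits, using log2(avg) = a*I + b.sigma for a strictly mixed avg.
    r = math.sqrt(sum(c * c for c in avg))
    lp, lm = math.log2((1 + r) / 2), math.log2((1 - r) / 2)
    a = (lp + lm) / 2
    scale = (lp - lm) / (2 * r) if r > 0 else 0.0
    return -s_w - a - scale * sum(x * y for x, y in zip(avg, w))

def mesh_capacity(k, iters=1000):
    # Lower bound on the capacity from the k^2 - k + 2 mesh inputs.
    outs = [channel(x) for x in mesh(k)]
    ents = [entropy(o) for o in outs]
    p = [1.0 / len(outs)] * len(outs)
    for _ in range(iters):
        avg = tuple(sum(pi * o[c] for pi, o in zip(p, outs)) for c in range(3))
        w = [pi * 2.0 ** rel_entropy(o, s, avg) for pi, o, s in zip(p, outs, ents)]
        total = sum(w)
        p = [x / total for x in w]
    avg = tuple(sum(pi * o[c] for pi, o in zip(p, outs)) for c in range(3))
    return entropy(avg) - sum(pi * s for pi, s in zip(p, ents))

lower_bound = mesh_capacity(12)
print(len(mesh(40)), lower_bound)
```

Because every distribution on the mesh is a valid ensemble, the value returned is a lower bound on the capacity for any `k` and any number of iterations; refining the mesh tightens it.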
Figure 2: (left) A polyhedral approximation of the sphere for $k = 40$. It has $k^2 - k + 2 = 1562$
points. (right) Approximation values by the $(k^2 - k + 2)$-point mesh. The horizontal axis is a log
plot of $k$, and the vertical axis is a log plot of the difference from the optimum value in bits. A line
$y = 0.05/x^2$ is drawn for reference.



6 Heuristic construction of a 4-state channel 

The existence of 4-state channels of the type found above can be understood as emerging from
small deformations of 3-state channels with a high level of symmetry. As noted above, a channel
of the form (3) maps the Bloch sphere to an ellipsoid with axes of lengths $\lambda_1, \lambda_2, \lambda_3$ and center
$(t_1, t_2, t_3)$. When $t_1 = t_2 = t_3 = 0$, the ellipsoid is centered at the origin and the capacity is achieved
with a pair of orthogonal inputs which map to the endpoints of the longest axis of the ellipsoid.
However, when some $t_k$ are non-zero, this no longer holds and it can even happen that the capacity
is achieved with a pair of orthogonal inputs which map to the endpoints of the shortest axis (as
for the example $\Gamma[\rho(x,y,z)] = \rho(0.55x,\ 0.55y,\ 0.5z + 0.5)$). By finding parameters which balance
these situations, 3-state channels were constructed in [?].
One of the 3-state channels in [?] is

$$\Gamma(\rho(x, y, z)) = \rho(0.6x,\ 0.6y,\ 0.5z + 0.5) \tag{4}$$

which has rotational symmetry about the z-axis of the Bloch sphere. This allows one to analyze
the problem in a two-dimensional plane, but with the limitation that at most a 3-state channel can
be found. Although the analysis of this channel was performed in the x-z plane, one could, instead,
choose the optimal inputs to lie in any plane containing the z-axis, e.g., the y-z plane. Moreover, if
one replaces the two inputs $(\pm 0.93681, 0, -0.34984)$, each with probability 0.29885, by any three
or more states with $z = -0.34984$ which also average to $(0, 0, -0.34984)$, the capacity is unchanged.
However, only three inputs are actually necessary to achieve this capacity.

To find a true 4-state channel, the symmetry must be lowered so that the full 3-dimensional
geometry of the Bloch sphere is required. The channel (4) was obtained as a convex combination of
an amplitude damping channel with $\lambda_1 = \lambda_2 \approx 0.707$ and a shifted depolarizing channel with
$\lambda_1 = \lambda_2 = 0.5$. Thus, one could expect to make minor changes to $\lambda_1$ and/or $\lambda_2$ without violating
the CP condition [12, 13] of

$$(\lambda_1 \pm \lambda_2)^2 \le (1 \pm \lambda_3)^2 - t_3^2$$

for channels with $t_1 = t_2 = 0$.

Letting $\lambda_1 = 0.6$, $\lambda_2 = 0.601$ gives a channel with reflection symmetry across the x-z and y-z planes.
Its capacity will require three input states which lie in the y-z plane as shown in Table 2. We now
wish to further reduce the symmetry by shifting the ellipsoid. To do so, one must first decrease $\lambda_3$
or $t_3$. We consider the channel

$$\Gamma[\rho(x, y, z)] = \rho(0.6x,\ 0.601y,\ 0.5z + 0.495) \tag{5}$$

which is still CP and requires three input states which lie in the y-z plane as shown in Table 2.
We now shift the channel in the x-direction and study

$$\Gamma[\rho(x, y, z)] = \rho(0.6x + 0.021,\ 0.601y,\ 0.5z + 0.495). \tag{6}$$

The CP condition [12,13] for a channel of the form

$$\Gamma[\rho(x, y, z)] = \rho(0.6x + t_1,\ 0.601y,\ 0.5z + 0.495) \tag{7}$$

reduces to $\det(I - R^\dagger R) \ge 0$ where

$$R = \begin{pmatrix} \dfrac{t_1}{\sqrt{(1.995)(0.005)}} & \dfrac{1.201}{\sqrt{(1.995)(1.005)}} \\[2ex] \dfrac{-0.001}{\sqrt{(0.995)(0.005)}} & \dfrac{t_1}{\sqrt{(0.995)(1.005)}} \end{pmatrix}.$$

This gives the quartic inequality $0.2805326349 - 101.0098436\, t_1^2 + 100.2531329\, t_1^4 \ge 0$, which holds
for $|t_1| \le 0.05277\ldots$.
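
The quartic bound on $t_1$ can be checked numerically from the determinant condition. The sketch below assumes that the matrix $R$ has the four entries $t_1/\sqrt{(1.995)(0.005)}$, $1.201/\sqrt{(1.995)(1.005)}$, $-0.001/\sqrt{(0.995)(0.005)}$ and $t_1/\sqrt{(0.995)(1.005)}$, and uses the identity $\det(I - R^T R) = 1 - \mathrm{Tr}(R^T R) + (\det R)^2$ for real $2 \times 2$ matrices. The bisection helper is an illustrative choice.

```python
import math

def cp_determinant(t1):
    # det(I - R^T R) for the real 2x2 matrix R associated with channel (7).
    r11 = t1 / math.sqrt(1.995 * 0.005)
    r12 = 1.201 / math.sqrt(1.995 * 1.005)
    r21 = -0.001 / math.sqrt(0.995 * 0.005)
    r22 = t1 / math.sqrt(0.995 * 1.005)
    trace = r11 ** 2 + r12 ** 2 + r21 ** 2 + r22 ** 2
    det_r = r11 * r22 - r12 * r21
    return 1.0 - trace + det_r ** 2

def largest_shift(lo=0.0, hi=0.1, steps=60):
    # Bisection for the sign change of the determinant on [lo, hi].
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if cp_determinant(mid) >= 0.0:
            lo = mid
        else:
            hi = mid
    return lo

t_max = largest_shift()
print(cp_determinant(0.0), t_max, cp_determinant(0.021))
```

The value at $t_1 = 0$ reproduces the constant term of the quartic, and the shift $t_1 = 0.021$ used in (6) lies well inside the admissible range.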

Although small enough to satisfy the CP condition, a shift of $t_1 = 0.021$ is sufficient to return
the (restricted) 3-state optimum to the x-z plane, across which the image has reflection symmetry.
In fact, the inputs $\rho(x, y, z)$ and $\rho(x, -y, z)$ have the same output entropy. Moreover, replacing all
inputs $\rho_i(x, y, z)$ by $\rho_i(x, -y, z)$ leaves the capacity unchanged. Therefore, either all optimal inputs
lie in the x-z plane or the set of optimal inputs contains pairs of the form $\rho(x, \pm y, z)$ with the same
probability. (This follows easily from a small modification of the convexity argument in [11].) Let

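$$\chi[\pi_1, \rho_1, \pi_2, \rho_2, \pi_3, \rho_3] = S\Big(\sum_i \pi_i \rho_i\Big) - \sum_i \pi_i S(\rho_i). \tag{8}$$

For simplicity, assume that $y_1 = y_2 = 0$, but $y_3 \ne 0$. Let $\pi_4 = \pi_3$ and $\rho_4 = \rho(x_3, -y_3, z_3)$. Then

$$\begin{aligned}
\chi[\pi_1, \rho_1, \pi_2, \rho_2, \tfrac12\pi_3, \rho_3, \tfrac12\pi_3, \rho_4]
&= \tfrac12 \chi[\pi_1, \rho_1, \pi_2, \rho_2, \pi_3, \rho_3] + \tfrac12 \chi[\pi_1, \rho_1, \pi_2, \rho_2, \pi_4, \rho_4] \\
&\quad + S(\rho) - \tfrac12 S\Big(\sum_{j=1,2,3} \pi_j \rho_j\Big) - \tfrac12 S\Big(\sum_{j=1,2,4} \pi_j \rho_j\Big) \\
&= \chi[\pi_1, \rho_1, \pi_2, \rho_2, \pi_3, \rho_3] + S(\rho) - \tfrac12 S\Big(\sum_{i=1,2,3} \pi_i \rho_i\Big) - \tfrac12 S\Big(\sum_{i=1,2,4} \pi_i \rho_i\Big) \\
&> \chi[\pi_1, \rho_1, \pi_2, \rho_2, \pi_3, \rho_3] = \chi[\pi_1, \rho_1, \pi_2, \rho_2, \pi_4, \rho_4]
\end{aligned}$$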
$\Gamma(\rho(x, y, z)) = \rho(0.6x,\ 0.6y,\ 0.5z + 0.5)$        $C(\Gamma) = 0.324990$

probability   inputs                                                      outputs
0.402338      ( 0.000000, 0.000000, 1.000000)                             ( 0.000000, 0.000000, 1.000000)
0.298830      ( 0.936786 cos ϑ,  0.936786 sin ϑ, -0.349902)               ( 0.562072 cos ϑ,  0.562072 sin ϑ, 0.3250492)
0.298830      (-0.936786 cos ϑ, -0.936786 sin ϑ, -0.349902)               (-0.562072 cos ϑ, -0.562072 sin ϑ, 0.325049)
average       ( 0.000000, 0.000000, 0.193215)                             ( 0.000000, 0.000000, 0.596608)

The ellipsoid is symmetric about the z-axis so that the optimal inputs can be chosen to lie in any
plane containing the z-axis. For ϑ = π/2, x = 0 and the optimal inputs lie in the y-z plane; for
ϑ = 0, y = 0 and the optimal inputs lie in the x-z plane.


$\Gamma(\rho(x, y, z)) = \rho(0.6x,\ 0.601y,\ 0.5z + 0.5)$        $C(\Gamma) = 0.325555$

probability   inputs                               outputs
0.380692      (0.000000,  0.000000,  1.000000)     (0.000000,  0.000000, 1.000000)
0.309653      (0.000000,  0.952435, -0.304740)     (0.000000,  0.572413, 0.347630)
0.309653      (0.000000, -0.952435, -0.304740)     (0.000000, -0.572414, 0.347630)
average       (0.000000,  0.000000,  0.191964)     (0.000000,  0.000000, 0.595982)

The longest axis of the ellipsoid is parallel to the y-axis, and optimal inputs lie in the y-z plane.


$\Gamma(\rho(x, y, z)) = \rho(0.6x,\ 0.601y,\ 0.5z + 0.495)$        $C(\Gamma) = 0.320535$

probability   inputs                               outputs
0.146660      (0.000000,  0.000000, 1.000000)      (0.000000,  0.000000, 0.995000)
0.426670      (0.000000,  0.999687, 0.025034)      (0.000000,  0.600811, 0.507517)
0.426670      (0.000000, -0.999687, 0.025034)      (0.000000, -0.600811, 0.507517)
average       (0.000000,  0.000000, 0.168022)      (0.000000,  0.000000, 0.579011)

The longest axis of the ellipsoid is parallel to the y-axis, and optimal inputs lie in the y-z plane.


$\Gamma(\rho(x, y, z)) = \rho(0.6x + 0.021,\ 0.601y,\ 0.5z + 0.495)$        $C_3(\Gamma) = 0.3214609877$

probability   inputs                                outputs                              $S[\Gamma(\rho)]$
0.213290      ( 0.252867, 0.000000,  0.9675017)     ( 0.172720, 0.000000, 0.978751)      0.029992
0.366051      ( 0.978544, 0.000000,  0.206036)      ( 0.608127, 0.000000, 0.598018)      0.379029
0.420657      (-0.967649, 0.000000, -0.252299)      (-0.559590, 0.000000, 0.368851)      0.645884
average       ( 0.005083, 0.000000,  0.1756493)     ( 0.024050, 0.000000, 0.582825)      0.738297

A shift in the x-direction offsets the slightly greater length parallel to the y-axis so that the restricted
3-state optimization inputs lie in the x-z plane. However, the 3-state capacity is less than that for
the unrestricted problem which requires four input states.


Table 2: Optimal 3-state ensembles for various channels
where $\rho = \pi_1\rho_1 + \pi_2\rho_2 + \tfrac12\pi_3\rho_3 + \tfrac12\pi_3\rho_4 = \tfrac12 \sum_{i=1,2,3} \pi_i\rho_i + \tfrac12 \sum_{i=1,2,4} \pi_i\rho_i$. The strict inequality then
follows from the strict concavity of $S(\rho)$.

To see why one might expect a 4-state optimum with one pair of inputs with $\pm y$ and two
with $y_i = 0$, consider the effect of replacing a state of the form $(A, 0, B)$ by a pair of the form
$(A', \pm(a + b), B')$ with $(A')^2 = A^2 - a^2$, $(B')^2 = B^2 - b^2$. Recall that increasing the length of
an output state decreases the entropy and, hence, increases the capacity; moreover, this effect is
greatest when the changes to the output are orthogonal to the level sets $x^2 + y^2 + z^2 = \text{const}$ of
entropy. For our channel, increasing $y_i$ with $a, b$ having the opposite sign of $A, B$ will increase the
contribution of $-S[\Gamma(\rho_i)]$ to the capacity. But one must also consider the competing effect of these
changes on $S[\Gamma(\rho_{Av})]$, for which the net result depends on the geometry of the image. Since $\Gamma(\rho_{Av})$
is near $(0, 0, 0.5)$, changes in $x, y$ will have little effect on the entropy. However, decreasing $z$ will
move the average closer to $\tfrac12 I$ in a direction near that of greatest increase in entropy. Comparison
of the results in Tables 1 and 2 shows results consistent with this analysis, but more complex due
to the various competing effects. Roughly speaking, the input at $(-0.967649, 0.000000, -0.252299)$
with entropy $S[\Gamma(\rho_3)] = 0.645884$ splits into the pair of inputs $(-0.473409, \pm 0.864646, -0.168140)$
with output entropy $S[\Gamma(\rho_i)] = 0.593580$. However, decreasing $|z_i|$ increases $z_i$ in this case; this is
offset by changing $\pi_3 = 0.4207$ to a pair with $\pi_i = 0.2772$, increasing the net weight to 0.5544 for
the states with negative $z_i$. But the new outputs still have higher entropy than those from inputs
with positive $z_i$. The net result is that the average outputs of $(0.024050, 0.000000, 0.582825)$ and
$(0.024026, 0.000000, 0.582804)$ are very close for the 3-state and 4-state optima, and the increase
from 3-state to 4-state capacity is only of order $10^{-5}$.

The 4-state channel found in Section 4 is not unique. For example, the channel $\Gamma(\rho(x,y,z)) =
\rho(0.8x + 0.22,\ 0.8015y,\ 0.75z + 0.245)$ also requires 4 states to optimize capacity. In view of the
discussion above, it is reasonable to expect that one can find a family of 4-state channels which
have the form $\Gamma(\rho(x,y,z)) = \rho(\lambda_1 x + \epsilon_1,\ (\lambda_1 + \epsilon_2)y,\ \lambda_3 z + t_3)$ with suitable small constants,
$\lambda_3 + t_3 = 1 - \epsilon_3$ and $\lambda_1 > \lambda_3$, chosen so that $\Gamma(\rho(x, y, z)) = \rho(\lambda_1 x,\ \lambda_1 y,\ \lambda_3 z + t_3)$ is close to a 3-state
channel.

In the class of channels above, one always has $t_2 = 0$, which raises the question of whether
or not there exist 4-state channels with all $t_k$ non-zero. Therefore, maps of the form
$\Gamma(\rho(x,y,z)) = \rho(0.6x + 0.021,\ 0.601y + t_2,\ 0.5z + 0.495)$ were considered with $t_2 \ne 0$. With $t_2 < 0.48$
such maps are completely positive, and the channel with $t_2 = 0.00005$ was shown to require
four inputs to achieve capacity.

7 Equivalence to a relative entropy optimization 

Reformulation of the capacity optimization in the dual form (2) was also used by Audenaert and
Braunstein [?] and by Shirokov [25] to obtain theoretical results, and plays an important role in
Shor's proof of equivalence of additivity questions. The implication that the optimal outputs
for the capacity then define a supporting hyperplane for the output entropy function $S[\Gamma(\rho)]$ can
also be reformulated in terms of relative entropy.

The relative entropy is defined as $H(\omega, \rho) = \mathrm{Tr}\, \omega (\log \omega - \log \rho)$. It then follows that

$$S(\rho) - \sum_i \pi_i S(\rho_i) = \sum_i \pi_i H(\rho_i, \rho) \tag{9}$$
[Figure 3 panels: (i) plot of $H[\Gamma(\omega(\cos\theta \sin\phi,\ \sin\theta \sin\phi,\ \cos\phi)), \Gamma(\rho_{Av})]$; (ii) detailed view of the dark "ridge" showing 3 distinct maxima and saddle points.]

Figure 3: Plots of relative entropy of output states with respect to the optimal average output as
a function of a pair of angles defining pure input states on the surface of the Bloch sphere. The
edges $\theta = 0$ and $\theta = 2\pi$ meet on the sphere, so that the figures show two halves of the two maxima
with $y = 0$, one near the north pole and one on the ridge.
and

$$C(\Gamma) = \sup_{\rho}\ \sup\Big\{ \sum_i \pi_i H[\Gamma(\rho_i), \Gamma(\rho)] \;:\; \sum_i \pi_i \rho_i = \rho,\ \pi_i > 0,\ \sum_i \pi_i = 1 \Big\}. \tag{10}$$

Moreover, for any fixed $\rho$, and any $\pi_i, \rho_i$,

$$\sum_i \pi_i H[\Gamma(\rho_i), \Gamma(\rho)] \le C(\Gamma) \le \sup_{\omega} H[\Gamma(\omega), \Gamma(\rho)]. \tag{11}$$

In fact, it was shown in [16] and [24] that

$$C(\Gamma) = \inf_{\rho} \sup_{\omega} H[\Gamma(\omega), \Gamma(\rho)] \tag{12}$$

from which it follows that when $\rho_{Av}$ is the optimal average input, $C(\Gamma) = H[\Gamma(\rho_i), \Gamma(\rho_{Av})]$ for all $i$.
Thus, a necessary condition that an ensemble $\mathcal{E} = \{\pi_i, \rho_i\}$ achieve the capacity is that all outputs
$\Gamma(\rho_i)$ are "equidistant" from the average output $\Gamma(\sum_i \pi_i \rho_i)$ in the sense that $H[\Gamma(\rho_i), \Gamma(\rho_{Av})]$ is
independent of $i$.

The 4-state optimal ensemble satisfies this requirement, and $H[\Gamma(\rho_i), \Gamma(\rho_{Av})] = 0.321485159$
for all $i$. If, instead, the 3-state ensemble for the same channel (i.e., the last reported in Table 2)
is used, one finds that $H[\Gamma(\rho_i), \Gamma(\rho^{(3)}_{Av})] = 0.321460988\ \forall i$, so that these states also satisfy the
equi-distance requirement. However, as one can see from Table 3,

$$\sup_{\omega} H[\Gamma(\omega), \Gamma(\rho^{(3)}_{Av})] > 0.3215 > H[\Gamma(\rho_i), \Gamma(\rho^{(3)}_{Av})]$$

showing that the 3-state ensemble is not optimal. Indeed, a plot of $H[\Gamma(\omega), \Gamma(\rho^{(3)}_{Av})]$ as shown in
Figure 4 shows four relative maxima, which lie closer to the 4-state inputs than to the 3-state
inputs for which $y_i = 0$. The supremum appears to be achieved for a pair of states with $(x, y, z) =
(-0.539291, \pm 0.822613, -0.180202)$. Thus, the relative entropy criterion seems to anticipate the
splitting of the input near $(-0.97, 0, -0.25)$ into a pair of inputs near $(-0.47, \pm 0.86, -0.17)$.
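
The equidistance condition can be checked directly on the optimal outputs of Table 1. The sketch below assumes those tabulated outputs and probabilities, and evaluates $H[\omega, \rho] = -S(\omega) - \mathrm{Tr}\,\omega \log \rho$ on Bloch vectors using $\log_2 \rho = a I + b \cdot \sigma$ for a strictly mixed qubit state $\rho$. The function names are illustrative.

```python
import math

# Probabilities and optimal outputs Gamma(rho_i), copied from Table 1.
PROBS = [0.2322825705, 0.2133220819, 0.2771976738, 0.2771976738]
OUTPUTS = [
    (0.1728455917, 0.0, 0.9787232022),
    (0.6080370599, 0.0, 0.5983719359),
    (-0.2630452520, 0.5196523295, 0.4109297812),
    (-0.2630452520, -0.5196523295, 0.4109297812),
]

def entropy(v):
    # Base-2 von Neumann entropy of the qubit state with Bloch vector v.
    r = math.sqrt(sum(c * c for c in v))
    return -sum(e * math.log2(e) for e in ((1 + r) / 2, (1 - r) / 2) if e > 0)

def log2_state(v):
    # Coefficients (a, b) with log2(rho) = a*I + b.sigma, for |v| < 1.
    r = math.sqrt(sum(c * c for c in v))
    lp, lm = math.log2((1 + r) / 2), math.log2((1 - r) / 2)
    scale = (lp - lm) / (2 * r) if r > 0 else 0.0
    return (lp + lm) / 2, tuple(scale * c for c in v)

def rel_entropy(w, v):
    # H[w, v] = -S(w) - Tr(w log2 v), using Tr(w) = 1 and Tr(w sigma_k) = w_k.
    a, b = log2_state(v)
    return -entropy(w) - a - sum(x * y for x, y in zip(b, w))

avg_out = tuple(sum(p * o[c] for p, o in zip(PROBS, OUTPUTS)) for c in range(3))
distances = [rel_entropy(o, avg_out) for o in OUTPUTS]
weighted = sum(p * h for p, h in zip(PROBS, distances))
print(distances, weighted)
```

By identity (9), the probability-weighted mean of these four relative entropies equals the Holevo quantity of the ensemble, so `weighted` doubles as a check on the capacity value.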

The relative entropy can also be used to check additivity without need to carry out the full
variation in (12). In fact, applying (11) to the product channel $\Gamma \otimes \Gamma$ gives

$$2C(\Gamma) \le C(\Gamma \otimes \Gamma) \le \sup_{\omega} H\big[(\Gamma \otimes \Gamma)(\omega),\ \Gamma(\rho_{Av}) \otimes \Gamma(\rho_{Av})\big]. \tag{13}$$

If the supremum on the right equals $2C(\Gamma)$, then the channel is additive. Furthermore, the
supremum restricted to product inputs equals $2C(\Gamma)$. Therefore, if the supremum is strictly greater than
$2C(\Gamma)$, it must be attained for a pure entangled state $\omega$. But this would imply that the optimal
average input is not a product and, hence, that $\Gamma$ is superadditive. Thus, to determine whether or
not additivity holds, it is enough to study the supremum in (13) for the product input $\rho_{Av} \otimes \rho_{Av}$;
it is not necessary to find the optimal inputs for the product channel.

In order to reformulate the relative entropy optimization in terms of a hyperplane condition,
we introduce some notation and review some elementary facts. First, recall that $\mathrm{Tr}\, A^\dagger B =
\sum_{jk} \bar{a}_{jk} b_{jk} = \bar{a} \cdot b$ where $a, b$ denote vectors with components $a_{jk}$ and $b_{jk}$ respectively.
Alternatively, let $\{M_k\}_{k=0,1,\ldots,d^2-1}$ be an orthonormal basis of $d \times d$ matrices with $\mathrm{Tr}\, M_j^\dagger M_k = \delta_{jk}$ and
$M_0 = d^{-1/2} I$. Then an arbitrary matrix $A$ can be written as $A = \sum_k \alpha_k M_k$ with $\alpha_k = \mathrm{Tr}\, M_k^\dagger A$, and
[Figure 4 panels: images of the two hemispheres of the Bloch sphere, left: $x > 0$, right: $x < 0$; color scale from 0.304 to 0.32.]

Figure 4: Relative entropy $H[\Gamma(\omega), \Gamma(\rho^{(3)}_{Av})]$ with respect to the 3-state average output $\Gamma(\rho^{(3)}_{Av})$ for
image states $\Gamma(\omega)$. Note that this figure is almost indistinguishable from Figure 1. However, the
actual locations and values are slightly different, as seen by comparing the values in Table 3 below
with those in Table 1.



$\omega(x,y,z)$                              relative entropy    nearest 4-state input
( 0.252867,  0.000000,  0.967501)       0.321460988         0.321460986
( 0.978544,  0.000000,  0.206036)       0.321460988         0.321460981
(-0.539291,  0.822613, -0.180202)       0.321505535         0.321504592
(-0.539291, -0.822613, -0.180202)       0.321505535         0.321504592

Table 3: Relative maxima of relative entropy with respect to the 3-state $\rho^{(3)}_{Av}$ for the 4-state channel
$\Gamma(\rho(x,y,z)) = \rho(0.6x + 0.021,\ 0.601y,\ 0.5z + 0.495)$. The relative entropy for the nearest 4-state
input is also given for comparison.
$\mathrm{Tr}\, A^\dagger B = \sum_k \bar{\alpha}_k \beta_k$. A familiar example of such a basis for $2 \times 2$ matrices is $\{\tfrac{1}{\sqrt2}\sigma_0, \tfrac{1}{\sqrt2}\sigma_1, \tfrac{1}{\sqrt2}\sigma_2, \tfrac{1}{\sqrt2}\sigma_3\}$
where $\sigma_0$ denotes the identity $I$. An example for $4 \times 4$ matrices is $\{\tfrac12 \sigma_j \otimes \sigma_k\}_{j,k=0,1,2,3}$. We will be
primarily interested in bases and matrices which, like the two examples above, are self-adjoint;
therefore, we drop the adjoint symbol $\dagger$ and assume the coefficients are real. For a density matrix $\rho$
we will let $\beta(\rho)$ be the vector associated with the trace zero part of $\rho$ so that $\rho = \tfrac1d I + \sum_k \beta_k M_k$.
Using the Pauli basis for qubits, $\beta(\rho)$ is simply the vector with components $(x, y, z)$ in $\mathbf{R}^3$ associated
with the Bloch sphere.

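Now let $F(\rho) = S[\Gamma(\rho)] - \xi \cdot \Gamma(\rho)$ with $\xi$ defining a supporting hyperplane for the capacity
optimization as discussed in Section 4, and let $G(\rho) = H[\Gamma(\rho), \Gamma(\rho_{Av})]$ with $\rho_{Av}$ the optimal average.
Writing $\log \Gamma(\rho_{Av}) = \sum_k \tau_k M_k$, one finds

$$\begin{aligned}
G(\rho) = H[\Gamma(\rho), \Gamma(\rho_{Av})] &= -S[\Gamma(\rho)] - \mathrm{Tr}\, \Gamma(\rho) \log \Gamma(\rho_{Av}) \\
&= -S[\Gamma(\rho)] - \tau_0 - \tau \cdot \beta(\Gamma(\rho)).
\end{aligned} \tag{14}$$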

Therefore, $H[\Gamma(\rho), \Gamma(\rho_{Av})] + S[\Gamma(\rho)]$ defines a hyperplane, and $G(\rho) \le C(\Gamma)$ holds with equality
for the optimal inputs $\rho_i$. This implies that the supporting hyperplane condition $F(\rho) \ge A$ holds
with equality for optimal inputs $\rho_i$ when $\xi = -\tau$. In that case, $F(\rho) = -G(\rho) - \tau_0$ and $A = -\tau_0 - C(\Gamma)$.
With $d+1$ optimal inputs, the supporting hyperplane is the unique hyperplane given by the relative
entropy.

For the 4-state channel $\xi = (-0.039662, 0, -0.962107)$ and we see from Table 1 that $A = 0.978506$
and $B = 0.321485$. A computation gives $\log \Gamma(\rho_{Av}) = -1.299989\, I + 0.039662\, \sigma_x + 0.962105\, \sigma_z$, from
which it follows immediately that $\tau = -\xi$ and $F(\rho) = 1.299989 - G(\rho)$ as expected.



8 Additivity 

As mentioned earlier, 4-state channels might be good candidates for examining the additivity of
channel capacity. Those considered here have the property $\lambda_2 > \max_{j=1,3} |\lambda_j|$, $t_2 = 0$ and $t_1, t_3 \ne 0$.
Channels of this type do not belong to one of the classes of qubit maps for which multiplicativity of
the maximal p-norm has been proved, and their geometry seems resistant to simple analysis. (See [?]
for a summary and further references.) Because one state lies very close to the Bloch sphere, with
all others much further away, one expects that additivity of minimal entropy and multiplicativity of
the maximal p-norm surely hold for this channel. Nevertheless, this has not been proven, suggesting
that the channel may have subtle properties. Indeed, most known proofs of additivity of minimal
entropy for a particular class of channels also yield additivity of channel capacity for the same
class. These conjectures are now known to be equivalent [27], but this equivalence requires the use
of non-trivial channel extensions and does not hold for individual channels. Thus the resistance to
proof of a seemingly obvious fact using current techniques may indicate that the far less obvious
additivity of channel capacity does not hold.

We will use the fact that the capacity of $\Gamma$ is additive if $\sup_\omega G(\omega) = 2C(\Gamma)$,
but superadditive if $G(\omega) > 2C(\Gamma)$ for some state $\omega$, where
$G(\omega) = H[(\Gamma \otimes \Gamma)(\omega), \Gamma(\rho_{Av}) \otimes \Gamma(\rho_{Av})]$.
The function $g(\rho) = H[\Gamma(\rho), \Gamma(\rho_{Av})]$ has 10 critical points (4 maxima,
4 saddle points, and 2 (relative) minima), as shown in the corresponding figure.
This implies that $G(\omega)$ has at least 100 critical points (16 maxima, 4 (relative) minima, and 80
saddle-like critical points) when one restricts $\omega$ to a product state. The complexity of this
landscape seems greater than that of any other class of channels studied. If the capacity of any
qubit channel is non-additive, it seems likely that it would be a channel of this type. Therefore,
a thorough numerical analysis is called for. Unfortunately, the large number of critical points
also makes a full optimization very challenging.
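
The count of critical points on product states can be reproduced by simple enumeration. The sketch below relies on the additivity of relative entropy over tensor products, so that on a product state $\omega = \rho_1 \otimes \rho_2$ one has $G(\omega) = g(\rho_1) + g(\rho_2)$; the labels and the helper `classify` are illustrative names, not taken from the paper.

```python
from itertools import product

# Critical points of g(rho) as stated in the text: 4 maxima, 4 saddles, 2 minima.
kinds = ["max"] * 4 + ["saddle"] * 4 + ["min"] * 2

def classify(k1, k2):
    # For a sum g(rho1) + g(rho2), a pair of critical points is a maximum
    # (minimum) only if both factors are maxima (minima); every other
    # combination is saddle-like.
    if k1 == k2 == "max":
        return "max"
    if k1 == k2 == "min":
        return "min"
    return "saddle-like"

counts = {}
for k1, k2 in product(kinds, repeat=2):
    label = classify(k1, k2)
    counts[label] = counts.get(label, 0) + 1

print(counts)
```

The enumeration gives 16 maxima, 4 minima and 80 saddle-like points, for 100 in total, matching the counts quoted above.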

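It suffices to optimize over pure states of the form $\omega = |\psi\rangle\langle\psi|$ with

$$|\psi\rangle = \sqrt{p} \begin{pmatrix} \cos\theta_u \\ e^{i\phi_u}\sin\theta_u \end{pmatrix} \otimes \begin{pmatrix} \cos\theta_v \\ e^{i\phi_v}\sin\theta_v \end{pmatrix} + e^{i\nu}\sqrt{1-p} \begin{pmatrix} e^{-i\phi_u}\sin\theta_u \\ -\cos\theta_u \end{pmatrix} \otimes \begin{pmatrix} e^{-i\phi_v}\sin\theta_v \\ -\cos\theta_v \end{pmatrix} \qquad (15)$$

and $p \in [0,1]$, $\phi_u, \phi_v, \nu \in [0, 2\pi]$, $\theta_u, \theta_v \in [0, \frac{\pi}{2}]$. To see why
this is true, note that (15) says that
$|\psi\rangle = \sqrt{p}\, |u\rangle \otimes |v\rangle + e^{i\nu} \sqrt{1-p}\, |u^\perp\rangle \otimes |v^\perp\rangle$,
where $|u^\perp\rangle$ denotes the vector orthogonal to
$|u\rangle = \begin{pmatrix} \cos\theta \\ e^{i\phi}\sin\theta \end{pmatrix}$. Note that

$$|u\rangle\langle u| = \tfrac{1}{2}\left[ I + \sin 2\theta \cos\phi\, \sigma_x + \sin 2\theta \sin\phi\, \sigma_y + \cos 2\theta\, \sigma_z \right].$$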
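
The Bloch-sphere expression for the projector can be checked directly. The following is a minimal sketch assuming the parametrization $|u\rangle = (\cos\theta, e^{i\phi}\sin\theta)$; the function names are illustrative.

```python
import cmath
import math

def projector(theta, phi):
    # |u> = (cos(theta), e^{i phi} sin(theta)); returns |u><u| as a 2x2 list.
    u = [complex(math.cos(theta)), cmath.exp(1j * phi) * math.sin(theta)]
    return [[u[r] * u[c].conjugate() for c in range(2)] for r in range(2)]

def bloch_matrix(theta, phi):
    # (1/2)[I + x sigma_x + y sigma_y + z sigma_z] with the stated components.
    x = math.sin(2 * theta) * math.cos(phi)
    y = math.sin(2 * theta) * math.sin(phi)
    z = math.cos(2 * theta)
    return [[(1 + z) / 2, (x - 1j * y) / 2],
            [(x + 1j * y) / 2, (1 - z) / 2]]

def max_difference(theta, phi):
    # Largest entrywise deviation between the two expressions.
    P, B = projector(theta, phi), bloch_matrix(theta, phi)
    return max(abs(P[r][c] - B[r][c]) for r in range(2) for c in range(2))

# Sample a few angles in the ranges used for the Schmidt form.
samples = [(0.0, 0.0), (0.3, 1.1), (math.pi / 4, 2.0), (1.2, 5.5), (math.pi / 2, 3.0)]
worst = max(max_difference(th, ph) for th, ph in samples)
print("largest deviation:", worst)
```

The deviation is at the level of floating-point rounding for every sampled pair of angles.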
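Now let $\gamma_u = \Gamma(|u\rangle\langle u|)$. Then we can write

$$(\Gamma \otimes \Gamma)(|\psi\rangle\langle\psi|) = p\, \gamma_u \otimes \gamma_v + (1-p)\, \gamma_{u^\perp} \otimes \gamma_{v^\perp} + \sqrt{p(1-p)}\, X \qquad (16)$$

where

$$X = e^{-i\nu}\, \Gamma(|u\rangle\langle u^\perp|) \otimes \Gamma(|v\rangle\langle v^\perp|) + e^{i\nu}\, \Gamma(|u^\perp\rangle\langle u|) \otimes \Gamma(|v^\perp\rangle\langle v|).$$

Since $\mathrm{Tr}\, |u\rangle\langle u^\perp| = \langle u^\perp | u \rangle = 0$ and $\Gamma$ is trace-preserving,
the partial traces of $X$ are zero, i.e.

$$\mathrm{Tr}_1 X = \mathrm{Tr}_2 X = 0. \qquad (17)$$

It then follows immediately that $\mathrm{Tr}\, X \log (\varrho_1 \otimes \varrho_2) = 0$ since

$$\mathrm{Tr}\, X\, (I_1 \otimes \log \varrho_2) + \mathrm{Tr}\, X\, (\log \varrho_1 \otimes I_2) = \mathrm{Tr}_2 [\log \varrho_2\, (\mathrm{Tr}_1 X)] + \mathrm{Tr}_1 [\log \varrho_1\, (\mathrm{Tr}_2 X)] = 0. \qquad (18)$$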

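Applying this with $\varrho = \Gamma(\rho_{Av})$ one finds that

$$\mathrm{Tr}\, (\Gamma \otimes \Gamma)(|\psi\rangle\langle\psi|) \log [\Gamma(\rho_{Av}) \otimes \Gamma(\rho_{Av})] = p\, \mathrm{Tr}\, (\gamma_u \otimes \gamma_v) \log [\Gamma(\rho_{Av}) \otimes \Gamma(\rho_{Av})] + (1-p)\, \mathrm{Tr}\, (\gamma_{u^\perp} \otimes \gamma_{v^\perp}) \log [\Gamma(\rho_{Av}) \otimes \Gamma(\rho_{Av})]. \qquad (19)$$

Therefore the second term in the relative entropy is affine in $p$. Hence any non-linearity in
$H[(\Gamma \otimes \Gamma)(|\psi\rangle\langle\psi|), \Gamma(\rho_{Av}) \otimes \Gamma(\rho_{Av})]$ must come
entirely from the entropy term $-S[(\Gamma \otimes \Gamma)(|\psi\rangle\langle\psi|)]$.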
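
The vanishing of the partial traces of the cross term $X$ uses only linearity and trace preservation, so it can be illustrated with any trace-preserving qubit map. The sketch below uses hypothetical channel parameters `lam` and `t` (they are not the 4-state channel of the paper) and illustrative helper names; it computes $\mathrm{Tr}_1 X$ and $\mathrm{Tr}_2 X$ from $\mathrm{Tr}_1 (A \otimes B) = (\mathrm{Tr}\, A)\, B$.

```python
import cmath
import math

def channel(M, lam, t):
    # Linear extension of a qubit map acting on Pauli components:
    # (1/2)[w0 I + w.sigma] -> (1/2)[w0 I + (lam*w + t*w0).sigma].
    (a, b), (c, d) = M
    w0 = a + d
    w = (b + c, 1j * (b - c), a - d)          # Tr(M sigma_k), k = x, y, z
    x, y, z = (lam[k] * w[k] + t[k] * w0 for k in range(3))
    return [[(w0 + z) / 2, (x - 1j * y) / 2],
            [(x + 1j * y) / 2, (w0 - z) / 2]]

def ket(theta, phi):
    return [complex(math.cos(theta)), cmath.exp(1j * phi) * math.sin(theta)]

def ket_perp(theta, phi):
    # Vector orthogonal to ket(theta, phi).
    return [cmath.exp(-1j * phi) * math.sin(theta), complex(-math.cos(theta))]

def outer(a, b):
    # |a><b|
    return [[a[r] * b[c].conjugate() for c in range(2)] for r in range(2)]

def trace(M):
    return M[0][0] + M[1][1]

def scaled_sum(c1, M1, c2, M2):
    return [[c1 * M1[r][c] + c2 * M2[r][c] for c in range(2)] for r in range(2)]

lam, t = (0.6, 0.6, 0.5), (0.0, 0.0, 0.2)     # hypothetical channel parameters
nu = 0.7
u, up = ket(0.3, 1.1), ket_perp(0.3, 1.1)
v, vp = ket(1.2, 2.5), ket_perp(1.2, 2.5)

A1, B1 = channel(outer(u, up), lam, t), channel(outer(v, vp), lam, t)
A2, B2 = channel(outer(up, u), lam, t), channel(outer(vp, v), lam, t)

# X = e^{-i nu} A1 (x) B1 + e^{i nu} A2 (x) B2, and Tr_1 (A (x) B) = Tr(A) B.
e_minus, e_plus = cmath.exp(-1j * nu), cmath.exp(1j * nu)
tr1_X = scaled_sum(e_minus * trace(A1), B1, e_plus * trace(A2), B2)
tr2_X = scaled_sum(e_minus * trace(B1), A1, e_plus * trace(B2), A2)
max_entry = max(abs(x) for M in (tr1_X, tr2_X) for row in M for x in row)
print("largest entry of Tr_1 X and Tr_2 X:", max_entry)
```

Both partial traces vanish to rounding accuracy, independently of the chosen angles, phase and channel parameters.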

Because of the difficulty of optimizing over all six parameters, plots of $G(\omega)$ were made as a
function of only $p, \nu$ with $u, v$ fixed and as a function of $p$ with the remaining 5 parameters
fixed. A typical example is shown in Figure 5 and appears to be a convex function of $p$ for several
choices of $\nu$. Many other examples were considered with $u, v$ both corresponding to optimal
inputs, $u, v$ chosen randomly, $u, v$ chosen to be highly non-optimal, and various combinations of
these. The shape of the curve seems to be extremely resilient for all inputs in Schmidt form (15)
and suggests convexity in $p$ with a deep minimum. Although the minimum lies above that for the
corresponding mixed state with $X = 0$, it is well below both endpoints. Changes as $\nu$ ranges from
$0$ to $2\pi$ are small.

States formed by superposing two products of the four optimal inputs $u_i$, namely
$|u_i\rangle \otimes |u_j\rangle$ and $e^{i\nu} |u_k\rangle \otimes |u_\ell\rangle$ (suitably weighted and
normalized), were also considered. Because these $u_i$ are not orthogonal, these states do not have
the form (15) and (19) need not hold. Although the relative entropy has a slightly different shape
as a function of $p$ and $\nu$, it still lies below the plane $2C(\Gamma)$ and has a deep minimum.

Figure 5: Typical plot of $G(\omega) = H[(\Phi \otimes \Phi)(\omega), \Phi(\rho_{av}) \otimes \Phi(\rho_{av})]$
as a function of $p$ for $\nu = 0, \frac{\pi}{2}, \pi, \frac{3\pi}{2}$ using pure states of the form (15)
with $u, v$ fixed and $e^{i\nu} = 1, i, -1, -i$. Endpoints correspond to product states and $p = 0.5$
to maximally entangled states.

Thus, there seems to be little room for obtaining a counter-example by varying the channel 
parameters. This may give the strongest numerical evidence for additivity yet, at least in the case 
of qubit channels. 

Remark: Because the second term in the relative entropy is affine in $p$ for states of the form (15),
concavity of the entropy function $g_{u,v,\nu}(p) = S[(\Gamma \otimes \Gamma)(|\psi\rangle\langle\psi|)]$ as
a function of $p$ for arbitrary states of the form (15) would imply convexity of the relative entropy
in $p$. This would immediately yield both additivity of minimal entropy and
of channel capacity. It is very tempting to conjecture that $g_{u,v,\nu}(p)$ is concave.
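
The reasoning behind this remark can be spelled out as follows. This is a sketch under the stated hypothesis that $g_{u,v,\nu}$ is concave; the constants $a_{u,v}$ and $b_{u,v}$ are illustrative names for the coefficients of the affine term in (19).

```latex
\begin{align*}
G(\omega) &= H\big[(\Gamma\otimes\Gamma)(|\psi\rangle\langle\psi|),\,
              \Gamma(\rho_{Av})\otimes\Gamma(\rho_{Av})\big]
           = -\,g_{u,v,\nu}(p) + a_{u,v} + b_{u,v}\,p , \\[4pt]
g_{u,v,\nu} \text{ concave}
  &\;\Longrightarrow\; G \text{ convex in } p
   \;\Longrightarrow\; \max_{p\in[0,1]} G(\omega) = \max\{\,G|_{p=0},\; G|_{p=1}\,\}, \\[4pt]
g_{u,v,\nu} \text{ concave}
  &\;\Longrightarrow\; \min_{p\in[0,1]} g_{u,v,\nu}(p)
     = \min\{\,g_{u,v,\nu}(0),\; g_{u,v,\nu}(1)\,\}.
\end{align*}
```

At $p = 0$ and $p = 1$ the input is a product state, for which the relative entropy is a sum of two single-channel terms each bounded by $C(\Gamma)$ and the output entropy is a sum of two single-channel entropies. Both the supremum of $G$ and the minimal output entropy would therefore be attained on product states.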

A similar conjecture was made independently in [3], with supporting evidence for a particular
set of channels with $d > 3$. Despite the appeal of this conjecture, it is false. Consider the channels
$\Gamma[\rho(x,y,z)] = \rho(\mu x, \mu y, 0.5 z)$ with $0 \le \mu \le 0.75$ and
$|\psi\rangle = \sqrt{p}\, |00\rangle + \sqrt{1-p}\, |11\rangle$. Then
$(\Gamma \otimes \Gamma)(|\psi\rangle\langle\psi|)$ has eigenvalues
$\frac{3}{16}, \frac{3}{16}, \frac{5 \pm 4\sqrt{1 + (16\mu^4 - 4)\, p(1-p)}}{16}$, so that
$f_\mu(p) = S[(\Gamma \otimes \Gamma)(|\psi\rangle\langle\psi|)]$
is concave for $\mu < \frac{1}{\sqrt{2}}$ and convex for $\mu > \frac{1}{\sqrt{2}}$, as shown in Figure 6.
This example also implies that a related conjecture [11] for Schur concavity is false. Note, however,
that the chosen inputs are not optimal when $\mu > \frac{1}{2}$ and far from optimal when
$\mu > \frac{1}{\sqrt{2}}$; indeed, even the lowest point on the convex curve shown lies well above the
true minimal output entropy of 1.2017521 for $\mu = \frac{1}{\sqrt{2}}$ and 1.087129 for $\mu = 0.75$.
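
The counterexample is easy to reproduce numerically. The sketch below assumes the channel $\Gamma[\rho(x,y,z)] = \rho(\mu x, \mu y, 0.5 z)$ and the input $\sqrt{p}\,|00\rangle + \sqrt{1-p}\,|11\rangle$, uses base-2 logarithms, and works from the closed-form spectrum $\frac{3}{16}, \frac{3}{16}, \frac{5 \pm 4\sqrt{1 + (16\mu^4 - 4) p(1-p)}}{16}$; the function names are illustrative.

```python
import math

def eigenvalues(mu, p):
    # Spectrum of (Gamma x Gamma)(|psi><psi|) for
    # Gamma: (x, y, z) -> (mu x, mu y, 0.5 z), |psi> = sqrt(p)|00> + sqrt(1-p)|11>.
    arg = 1.0 + (16.0 * mu**4 - 4.0) * p * (1.0 - p)
    root = math.sqrt(max(0.0, arg))           # guard against rounding below zero
    return [3 / 16, 3 / 16, (5 + 4 * root) / 16, (5 - 4 * root) / 16]

def entropy(probs):
    # Entropy in bits, ignoring zero eigenvalues.
    return -sum(x * math.log2(x) for x in probs if x > 0)

def f(mu, p):
    return entropy(eigenvalues(mu, p))

def second_difference(mu, p=0.5, h=0.1):
    # Negative for a concave curve, positive for a convex one.
    return f(mu, p + h) - 2 * f(mu, p) + f(mu, p - h)

def product_input_entropy(mu):
    # Output entropy for a product of the inputs (|0> +/- |1>)/sqrt(2),
    # each of which is mapped to a state with eigenvalues (1 +/- mu)/2.
    q = (1 + mu) / 2
    return 2 * entropy([q, 1 - q])

for mu in (0.5, 1 / math.sqrt(2), 0.75):
    print(mu, second_difference(mu), f(mu, 0.5), product_input_entropy(mu))
```

The second difference is negative for $\mu = 0.5$, vanishes to rounding accuracy for $\mu = 1/\sqrt{2}$, and is positive for $\mu = 0.75$. The product-input entropies agree with the quoted values 1.2017521 and 1.087129, both well below the corresponding values of $f_\mu$.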

If products of the optimal inputs $\frac{1}{\sqrt{2}}(|0\rangle \pm |1\rangle)$ are entangled, the
corresponding entropy function is known to be concave. Moreover, King has shown [8] that both the
minimal entropy and the capacity are additive for these channels for all $\mu$.

It seems likely that the conjectured concavity holds when optimal inputs are entangled; however,
this is not sufficient to prove additivity of either capacity or minimal entropy.

Figure 6: Plots of $f_\mu(p) = S[(\Gamma \otimes \Gamma)(|\psi\rangle\langle\psi|)]$ for
$\mu = 0, 0.5, 0.707, 0.75$ with $\Gamma[\rho(x,y,z)] = \rho(\mu x, \mu y, 0.5 z)$ and
$|\psi\rangle = \sqrt{p}\, |00\rangle + \sqrt{1-p}\, |11\rangle$. The top curve with $\mu = 0$ reduces to
the usual concavity of the mixed state
$(\Gamma \otimes \Gamma)(p\, |00\rangle\langle 00| + (1-p)\, |11\rangle\langle 11|)$; the next with
$\mu = 0.5$ shows the expected concavity; the flat horizontal curve is for $\mu = 0.707$, or
$\frac{1}{\sqrt{2}}$; the bottom curve shows $\mu = 0.75$, for which the inputs are no longer optimal
and $f_\mu(p)$ is convex.